Skip to content

Conversation

@PeimingLiu
Copy link
Member

@PeimingLiu PeimingLiu commented May 20, 2025

It seems that the number of padding value should be computed based on the relative distance between ptr and buffer.begin() (instead of the absolute address of ptr). Otherwise, the skipped padding value will be indeterministic and depends on the initial alignment of buffer.begin()?

Or is there any implicit assumption on the alignment of buffer that holds the bytecode?

See

size_t paddingSize = llvm::alignTo(curOffset, alignment) - curOffset;
for how ByteCoderWriter emits padding values.

It seems that the number of padding value should be computed based on
the offset of the current pointer to the bytecode buffer
begin (instead of the absolute address of `ptr`). Otherwise, the
skipped padding value will be dependent on the alignment of
`buffer.begin()`?
@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels May 20, 2025
@llvmbot
Copy link
Member

llvmbot commented May 20, 2025

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Peiming Liu (PeimingLiu)

Changes

It seems that the number of padding value should be computed based on the relative distance between ptr and buffer.begin() (instead of the absolute address of ptr). Otherwise, the skipped padding value will be indeterministic and depends on the alignment of buffer.begin()?

Or is there any implicit assumption on the alignment of buffer that holds the bytecode?


Full diff: https://github.com/llvm/llvm-project/pull/140660.diff

1 Files Affected:

  • (modified) mlir/lib/Bytecode/Reader/BytecodeReader.cpp (+3-2)
diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index 1052946d4550b..793170dcc594c 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -107,7 +107,8 @@ class EncodingReader {
       return emitError("expected alignment to be a power-of-two");
 
     auto isUnaligned = [&](const uint8_t *ptr) {
-      return ((uintptr_t)ptr & (alignment - 1)) != 0;
+      unsigned offset = ptr - buffer.begin();
+      return (offset & (alignment - 1)) != 0;
     };
 
     // Shift the reader position to the next alignment boundary.
@@ -1506,7 +1507,7 @@ class mlir::BytecodeReader::Impl {
     UseListOrderStorage(bool isIndexPairEncoding,
                         SmallVector<unsigned, 4> &&indices)
         : indices(std::move(indices)),
-          isIndexPairEncoding(isIndexPairEncoding){};
+          isIndexPairEncoding(isIndexPairEncoding) {};
     /// The vector containing the information required to reorder the
     /// use-list of a value.
     SmallVector<unsigned, 4> indices;

@PeimingLiu PeimingLiu requested a review from chsigg May 20, 2025 02:54
@River707
Copy link
Contributor

River707 commented May 20, 2025

The alignment in the reader must be for the actual pointer address, because it gets used to directly load/read data that has alignment requirements. There is an expectation that whoever created the buffer (by alloc or mmap) ensures that the alignment is at least to the amount expected, though in a majority of cases (unless you have buffers with large alignment constraints) it already is given malloc gives you at least max_align_t

Copy link
Contributor

@River707 River707 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking RC because this is not the expected reader behavior.

@PeimingLiu
Copy link
Member Author

PeimingLiu commented May 20, 2025

Make sense, that was why I had the question "Or is there any implicit assumption on the alignment of buffer that holds the bytecode?"

at least to the amount expected,

What is the expected amount (alignof(max_align_t))? Also, does it make sense to return an Error when the buffer is not aligned as expected? This is quite hard to dig when error occurs.

I might have missed important pieces, but it still seems somehow unsafe. E.g.,

// In ByteCodeWriter.cpp
Section_1 = {/*size = 16 bytes*/} 
Section_2 = {/*alignment=32 bytes =>*/padding = char[16]}
// In ByteCodeReader.cpp

// content alignof(16) say, ptr = 0x10
Section_1 = {/*size = 16 bytes*/} 
// ptr = 0x20 (already aligned to 32)
Section_2 = {/*pad value loaded as section data*/}.

It is only guaranteed to be safe when the buffer is aligned to the maximum section alignments IIUC.

@PeimingLiu PeimingLiu closed this May 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mlir:core MLIR Core Infrastructure mlir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants